Overview

Dataset statistics

Number of variables: 14
Number of observations: 1998
Missing cells: 6208
Missing cells (%): 22.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 926.1 KiB
Average record size in memory: 474.6 B

Variable types

NUM: 8
CAT: 6

Warnings

proyecto has a high cardinality: 101 distinct values (High cardinality)
primiparos is highly correlated with admitidos (High correlation)
admitidos is highly correlated with primiparos (High correlation)
anno_semestre is highly correlated with semestre (High correlation)
semestre is highly correlated with anno_semestre (High correlation)
modalidad is highly correlated with nivel (High correlation)
nivel is highly correlated with modalidad (High correlation)
inscritos has 1243 (62.2%) missing values (Missing)
admitidos has 783 (39.2%) missing values (Missing)
primiparos has 817 (40.9%) missing values (Missing)
matriculados has 598 (29.9%) missing values (Missing)
egresados has 888 (44.4%) missing values (Missing)
graduados has 831 (41.6%) missing values (Missing)
retirados has 1030 (51.6%) missing values (Missing)
semestre is uniformly distributed (Uniform)
anno_semestre is uniformly distributed (Uniform)
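
The missing-value warnings can be cross-checked against the dataset statistics. The following is a plain-Python sketch, not part of pandas-profiling: it takes the seven counts from the warnings, adds the 18 missing values that the nivel variable section reports, and recomputes the percentages and the overall total. The variable names are illustrative only.

```python
# Row and column counts as reported under "Dataset statistics".
N_ROWS = 1998
N_COLS = 14

# Missing-value counts per variable; nivel's 18 comes from its own variable section.
missing = {
    "inscritos": 1243,
    "admitidos": 783,
    "primiparos": 817,
    "matriculados": 598,
    "egresados": 888,
    "graduados": 831,
    "retirados": 1030,
    "nivel": 18,
}


def missing_pct(count, total=N_ROWS):
    """Share of missing values in one column, in percent with one decimal."""
    return round(100 * count / total, 1)


per_variable_pct = {name: missing_pct(count) for name, count in missing.items()}
total_missing = sum(missing.values())
total_pct = round(100 * total_missing / (N_ROWS * N_COLS), 1)

print(per_variable_pct)
print(total_missing, total_pct)
```

The per-variable counts add up to the reported number of missing cells, and dividing by all 14 × 1998 cells gives the reported overall percentage.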

Reproduction

Analysis started: 2020-12-13 02:04:04.623107
Analysis finished: 2020-12-13 02:04:11.161733
Duration: 6.54 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
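
A report of this kind is typically regenerated with a few lines of Python. The sketch below is only an assumption about how that could look with pandas-profiling v2.9.0: the function name, the CSV path, the report title, and the output file name are hypothetical, and the actual input file and configuration of this report are not known.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report; both paths are hypothetical."""
    # Imported lazily so this module can be loaded even where the libraries are absent.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("data.csv")` would write `report.html` next to the script, provided the hypothetical `data.csv` exists.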

Variables

anno
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013
Minimum: 2009
Maximum: 2017
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 2009
5-th percentile: 2009
Q1: 2011
median: 2013
Q3: 2015
95-th percentile: 2017
Maximum: 2017
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.582635283
Coefficient of variation (CV): 0.001282978283
Kurtosis: -1.230074608
Mean: 2013
Median Absolute Deviation (MAD): 2
Skewness: 0
Sum: 4021974
Variance: 6.670005008
Monotonicity: Not monotonic
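
These figures can be reproduced from the value counts alone. The sketch below uses only the Python standard library and assumes, as the numbers suggest, that the variance is the sample variance (dividing by n - 1) and that the CV is the standard deviation divided by the mean; the column is rebuilt from the reported counts of 222 rows per year.

```python
import math
import statistics

# Rebuild the column from the value counts: each year 2009..2017 appears 222 times.
anno = [year for year in range(2009, 2018) for _ in range(222)]

mean = statistics.mean(anno)
variance = statistics.variance(anno)  # sample variance, divides by n - 1
std = math.sqrt(variance)
cv = std / mean                       # coefficient of variation

print(len(anno), sum(anno), mean)
print(round(variance, 9), round(std, 9), round(cv, 12))
```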
Histogram with fixed size bins (bins=9)

Common values
Value | Count | Frequency (%)
2017 | 222 | 11.1%
2016 | 222 | 11.1%
2015 | 222 | 11.1%
2014 | 222 | 11.1%
2013 | 222 | 11.1%
2012 | 222 | 11.1%
2011 | 222 | 11.1%
2010 | 222 | 11.1%
2009 | 222 | 11.1%

Minimum values
Value | Count | Frequency (%)
2009 | 222 | 11.1%
2010 | 222 | 11.1%
2011 | 222 | 11.1%
2012 | 222 | 11.1%
2013 | 222 | 11.1%
2014 | 222 | 11.1%
2015 | 222 | 11.1%
2016 | 222 | 11.1%
2017 | 222 | 11.1%

Maximum values
Value | Count | Frequency (%)
2017 | 222 | 11.1%
2016 | 222 | 11.1%
2015 | 222 | 11.1%
2014 | 222 | 11.1%
2013 | 222 | 11.1%
2012 | 222 | 11.1%
2011 | 222 | 11.1%
2010 | 222 | 11.1%
2009 | 222 | 11.1%

semestre
Categorical

HIGH CORRELATION
UNIFORM

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 KiB

Value | Count | Frequency (%)
3 | 999 | 50.0%
1 | 999 | 50.0%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 2
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
1 | 999 | 50.0%
3 | 999 | 50.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1998 | 100.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
1 | 999 | 50.0%
3 | 999 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1998 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
1 | 999 | 50.0%
3 | 999 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1998 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
1 | 999 | 50.0%
3 | 999 | 50.0%

anno_semestre
Categorical

HIGH CORRELATION
UNIFORM

Distinct: 18
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 KiB

Value | Count | Frequency (%)
03/01/2010 12:00:00 AM | 111 | 5.6%
03/01/2016 12:00:00 AM | 111 | 5.6%
01/01/2013 12:00:00 AM | 111 | 5.6%
01/01/2015 12:00:00 AM | 111 | 5.6%
01/01/2012 12:00:00 AM | 111 | 5.6%
03/01/2011 12:00:00 AM | 111 | 5.6%
03/01/2012 12:00:00 AM | 111 | 5.6%
01/01/2017 12:00:00 AM | 111 | 5.6%
01/01/2016 12:00:00 AM | 111 | 5.6%
01/01/2014 12:00:00 AM | 111 | 5.6%
03/01/2009 12:00:00 AM | 111 | 5.6%
03/01/2015 12:00:00 AM | 111 | 5.6%
01/01/2010 12:00:00 AM | 111 | 5.6%
03/01/2013 12:00:00 AM | 111 | 5.6%
01/01/2009 12:00:00 AM | 111 | 5.6%
03/01/2017 12:00:00 AM | 111 | 5.6%
03/01/2014 12:00:00 AM | 111 | 5.6%
01/01/2011 12:00:00 AM | 111 | 5.6%
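
The values of anno_semestre are stored as text, and each appears to combine a year with a semester code (01 or 03) in the first date field, which would explain the high correlation with semestre. The sketch below shows one way to split such a string with the standard library; the format string and the reading of the first field as the semester are assumptions, and the function name is hypothetical.

```python
from datetime import datetime

# Assumed layout of the stored strings: two date fields, year, 12-hour clock time.
FORMAT = "%m/%d/%Y %I:%M:%S %p"


def split_anno_semestre(value):
    """Return (year, semester) from a string such as '03/01/2010 12:00:00 AM'."""
    parsed = datetime.strptime(value, FORMAT)
    # Assumption: the first date field (parsed here as the month) carries the semester code.
    return parsed.year, parsed.month


print(split_anno_semestre("03/01/2010 12:00:00 AM"))
```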
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 22
Median length: 22
Mean length: 22
Min length: 22

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
0 | 14430 | 32.8%
1 | 6993 | 15.9%
2 | 4218 | 9.6%
/ | 3996 | 9.1%
(space) | 3996 | 9.1%
: | 3996 | 9.1%
A | 1998 | 4.5%
M | 1998 | 4.5%
3 | 1221 | 2.8%
9 | 222 | 0.5%
4 | 222 | 0.5%
5 | 222 | 0.5%
6 | 222 | 0.5%
7 | 222 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 27972 | 63.6%
Other Punctuation | 7992 | 18.2%
Space Separator | 3996 | 9.1%
Uppercase Letter | 3996 | 9.1%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
0 | 14430 | 51.6%
1 | 6993 | 25.0%
2 | 4218 | 15.1%
3 | 1221 | 4.4%
9 | 222 | 0.8%
4 | 222 | 0.8%
5 | 222 | 0.8%
6 | 222 | 0.8%
7 | 222 | 0.8%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
/ | 3996 | 50.0%
: | 3996 | 50.0%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 3996 | 100.0%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
A | 1998 | 50.0%
M | 1998 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 39960 | 90.9%
Latin | 3996 | 9.1%

Most frequent Common characters

Value | Count | Frequency (%)
0 | 14430 | 36.1%
1 | 6993 | 17.5%
2 | 4218 | 10.6%
/ | 3996 | 10.0%
(space) | 3996 | 10.0%
: | 3996 | 10.0%
3 | 1221 | 3.1%
9 | 222 | 0.6%
4 | 222 | 0.6%
5 | 222 | 0.6%
6 | 222 | 0.6%
7 | 222 | 0.6%

Most frequent Latin characters

Value | Count | Frequency (%)
A | 1998 | 50.0%
M | 1998 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 43956 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
0 | 14430 | 32.8%
1 | 6993 | 15.9%
2 | 4218 | 9.6%
/ | 3996 | 9.1%
(space) | 3996 | 9.1%
: | 3996 | 9.1%
A | 1998 | 4.5%
M | 1998 | 4.5%
3 | 1221 | 2.8%
9 | 222 | 0.5%
4 | 222 | 0.5%
5 | 222 | 0.5%
6 | 222 | 0.5%
7 | 222 | 0.5%

facultad
Categorical

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 KiB

Value | Count | Frequency (%)
FACULTAD DE CIENCIAS Y EDUCACION | 630 | 31.5%
FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | 540 | 27.0%
FACULTAD DE INGENIERIA | 414 | 20.7%
FACULTAD DE MEDIO AMBIENTE Y RECURSOS NATURALES | 270 | 13.5%
FACULTAD DE ARTES-ASAB | 108 | 5.4%
VICERRECTORIA CONVENIOS | 18 | 0.9%
VICERRECTORIA ACADEMICA | 18 | 0.9%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 50
Median length: 32
Mean length: 36.11711712
Min length: 22

Overview of Unicode Properties

Unique unicode characters: 22
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
(space) | 8460 | 11.7%
A | 8442 | 11.7%
C | 7578 | 10.5%
E | 7236 | 10.0%
I | 5940 | 8.2%
D | 4842 | 6.7%
N | 4284 | 5.9%
T | 4266 | 5.9%
O | 3942 | 5.5%
L | 3852 | 5.3%
U | 3132 | 4.3%
F | 1962 | 2.7%
S | 1674 | 2.3%
G | 1494 | 2.1%
R | 1440 | 2.0%
Y | 900 | 1.2%
- | 648 | 0.9%
M | 558 | 0.8%
P | 540 | 0.7%
/ | 540 | 0.7%
B | 378 | 0.5%
V | 54 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 62514 | 86.6%
Space Separator | 8460 | 11.7%
Dash Punctuation | 648 | 0.9%
Other Punctuation | 540 | 0.7%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
A | 8442 | 13.5%
C | 7578 | 12.1%
E | 7236 | 11.6%
I | 5940 | 9.5%
D | 4842 | 7.7%
N | 4284 | 6.9%
T | 4266 | 6.8%
O | 3942 | 6.3%
L | 3852 | 6.2%
U | 3132 | 5.0%
F | 1962 | 3.1%
S | 1674 | 2.7%
G | 1494 | 2.4%
R | 1440 | 2.3%
Y | 900 | 1.4%
M | 558 | 0.9%
P | 540 | 0.9%
B | 378 | 0.6%
V | 54 | 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 8460 | 100.0%

Most frequent Dash Punctuation characters

Value | Count | Frequency (%)
- | 648 | 100.0%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
/ | 540 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 62514 | 86.6%
Common | 9648 | 13.4%

Most frequent Latin characters

Value | Count | Frequency (%)
A | 8442 | 13.5%
C | 7578 | 12.1%
E | 7236 | 11.6%
I | 5940 | 9.5%
D | 4842 | 7.7%
N | 4284 | 6.9%
T | 4266 | 6.8%
O | 3942 | 6.3%
L | 3852 | 6.2%
U | 3132 | 5.0%
F | 1962 | 3.1%
S | 1674 | 2.7%
G | 1494 | 2.4%
R | 1440 | 2.3%
Y | 900 | 1.4%
M | 558 | 0.9%
P | 540 | 0.9%
B | 378 | 0.6%
V | 54 | 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 8460 | 87.7%
- | 648 | 6.7%
/ | 540 | 5.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 72162 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
(space) | 8460 | 11.7%
A | 8442 | 11.7%
C | 7578 | 10.5%
E | 7236 | 10.0%
I | 5940 | 8.2%
D | 4842 | 6.7%
N | 4284 | 5.9%
T | 4266 | 5.9%
O | 3942 | 5.5%
L | 3852 | 5.3%
U | 3132 | 4.3%
F | 1962 | 2.7%
S | 1674 | 2.3%
G | 1494 | 2.1%
R | 1440 | 2.0%
Y | 900 | 1.2%
- | 648 | 0.9%
M | 558 | 0.8%
P | 540 | 0.7%
/ | 540 | 0.7%
B | 378 | 0.5%
V | 54 | 0.1%

proyecto
Categorical

HIGH CARDINALITY

Distinct: 101
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 KiB

Value | Count | Frequency (%)
INGENIERIA ELECTRICA | 54 | 2.7%
NO_REGISTRA | 54 | 2.7%
INGENIERIA EN TELEMATICA | 36 | 1.8%
INGENIERIA EN CONTROL | 36 | 1.8%
INGENIERA CIVIL | 36 | 1.8%
ARTES ESCENICAS | 36 | 1.8%
INGENIERIA DE PRODUCCION | 36 | 1.8%
INGENIERIA MECANICA | 36 | 1.8%
LICENCIATURA EN EDUCACION PRIMARIA | 18 | 0.9%
MAESTRIA EN INGENIERIA CIVIL | 18 | 0.9%
MAESTRIA EN LINGUISTICA APLICADA A LA ENSENANZA DEL INGLES | 18 | 0.9%
TECNOLOGIA EN SANEAMIENTO AMBIENTAL | 18 | 0.9%
TECNOLOGIA EN ADMINISTRACION DEPORTIVA | 18 | 0.9%
LICENCIATURA EN QUIMICA | 18 | 0.9%
ESPECIALIZACION EN INFANCIA, CULTURA Y DESARROLLO | 18 | 0.9%
INGENIERIA ELECTRONICA | 18 | 0.9%
INGENIERIA SANITARIA | 18 | 0.9%
ESPECIALIZACION EN METODOLOGIA Y APRENDIZAJE DEL ESPANOL COMO LENGUA MATERNA | 18 | 0.9%
ESPECIALIZACION TECNOLOGICA EN CONTROL ELECTRONICO E INSTRUMENTACION | 18 | 0.9%
TECNOLOGIA EN ELECTRONICA | 18 | 0.9%
ESPECIALIZACION EN EDUCACION EN TECNOLOGIA | 18 | 0.9%
MATEMATICAS | 18 | 0.9%
ESPECIALIZACION EN VOZ ESCENICA | 18 | 0.9%
INGENIERIA EN CONTROL ELECTRONICO E INSTRUMENTACION | 18 | 0.9%
LICENCIATURA EN FISICA | 18 | 0.9%
Other values (76) | 1368 | 68.5%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 84
Median length: 35
Mean length: 37.0990991
Min length: 11

Overview of Unicode Properties

Unique unicode characters: 29
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
E | 9090 | 12.3%
I | 9036 | 12.2%
A | 7794 | 10.5%
(space) | 7182 | 9.7%
N | 7038 | 9.5%
C | 5490 | 7.4%
O | 4608 | 6.2%
S | 3870 | 5.2%
T | 3474 | 4.7%
R | 3150 | 4.2%
L | 3078 | 4.2%
D | 1764 | 2.4%
G | 1746 | 2.4%
U | 1512 | 2.0%
M | 1404 | 1.9%
P | 1152 | 1.6%
Z | 720 | 1.0%
Y | 522 | 0.7%
V | 396 | 0.5%
F | 360 | 0.5%
B | 324 | 0.4%
J | 108 | 0.1%
, | 72 | 0.1%
H | 54 | 0.1%
_ | 54 | 0.1%
Other values (4) | 126 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 66780 | 90.1%
Space Separator | 7182 | 9.7%
Other Punctuation | 72 | 0.1%
Connector Punctuation | 54 | 0.1%
Dash Punctuation | 36 | < 0.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
E | 9090 | 13.6%
I | 9036 | 13.5%
A | 7794 | 11.7%
N | 7038 | 10.5%
C | 5490 | 8.2%
O | 4608 | 6.9%
S | 3870 | 5.8%
T | 3474 | 5.2%
R | 3150 | 4.7%
L | 3078 | 4.6%
D | 1764 | 2.6%
G | 1746 | 2.6%
U | 1512 | 2.3%
M | 1404 | 2.1%
P | 1152 | 1.7%
Z | 720 | 1.1%
Y | 522 | 0.8%
V | 396 | 0.6%
F | 360 | 0.5%
B | 324 | 0.5%
J | 108 | 0.2%
H | 54 | 0.1%
X | 36 | 0.1%
Q | 36 | 0.1%
W | 18 | < 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 7182 | 100.0%

Most frequent Other Punctuation characters

Value | Count | Frequency (%)
, | 72 | 100.0%

Most frequent Dash Punctuation characters

Value | Count | Frequency (%)
- | 36 | 100.0%

Most frequent Connector Punctuation characters

Value | Count | Frequency (%)
_ | 54 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 66780 | 90.1%
Common | 7344 | 9.9%

Most frequent Latin characters

Value | Count | Frequency (%)
E | 9090 | 13.6%
I | 9036 | 13.5%
A | 7794 | 11.7%
N | 7038 | 10.5%
C | 5490 | 8.2%
O | 4608 | 6.9%
S | 3870 | 5.8%
T | 3474 | 5.2%
R | 3150 | 4.7%
L | 3078 | 4.6%
D | 1764 | 2.6%
G | 1746 | 2.6%
U | 1512 | 2.3%
M | 1404 | 2.1%
P | 1152 | 1.7%
Z | 720 | 1.1%
Y | 522 | 0.8%
V | 396 | 0.6%
F | 360 | 0.5%
B | 324 | 0.5%
J | 108 | 0.2%
H | 54 | 0.1%
X | 36 | 0.1%
Q | 36 | 0.1%
W | 18 | < 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 7182 | 97.8%
, | 72 | 1.0%
_ | 54 | 0.7%
- | 36 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 74124 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
E | 9090 | 12.3%
I | 9036 | 12.2%
A | 7794 | 10.5%
(space) | 7182 | 9.7%
N | 7038 | 9.5%
C | 5490 | 7.4%
O | 4608 | 6.2%
S | 3870 | 5.2%
T | 3474 | 4.7%
R | 3150 | 4.2%
L | 3078 | 4.2%
D | 1764 | 2.4%
G | 1746 | 2.4%
U | 1512 | 2.0%
M | 1404 | 1.9%
P | 1152 | 1.6%
Z | 720 | 1.0%
Y | 522 | 0.7%
V | 396 | 0.5%
F | 360 | 0.5%
B | 324 | 0.4%
J | 108 | 0.1%
, | 72 | 0.1%
H | 54 | 0.1%
_ | 54 | 0.1%
Other values (4) | 126 | 0.2%

nivel
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 18
Missing (%): 0.9%
Memory size: 15.7 KiB

Value | Count | Frequency (%)
PREGRADO | 1116 | 55.9%
POSGRADO | 864 | 43.2%
(Missing) | 18 | 0.9%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 7.954954955
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 10
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
R | 3096 | 19.5%
O | 2844 | 17.9%
P | 1980 | 12.5%
G | 1980 | 12.5%
A | 1980 | 12.5%
D | 1980 | 12.5%
E | 1116 | 7.0%
S | 864 | 5.4%
n | 36 | 0.2%
a | 18 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 15840 | 99.7%
Lowercase Letter | 54 | 0.3%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
R | 3096 | 19.5%
O | 2844 | 18.0%
P | 1980 | 12.5%
G | 1980 | 12.5%
A | 1980 | 12.5%
D | 1980 | 12.5%
E | 1116 | 7.0%
S | 864 | 5.5%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 36 | 66.7%
a | 18 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 15894 | 100.0%

Most frequent Latin characters

Value | Count | Frequency (%)
R | 3096 | 19.5%
O | 2844 | 17.9%
P | 1980 | 12.5%
G | 1980 | 12.5%
A | 1980 | 12.5%
D | 1980 | 12.5%
E | 1116 | 7.0%
S | 864 | 5.4%
n | 36 | 0.2%
a | 18 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 15894 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
R | 3096 | 19.5%
O | 2844 | 17.9%
P | 1980 | 12.5%
G | 1980 | 12.5%
A | 1980 | 12.5%
D | 1980 | 12.5%
E | 1116 | 7.0%
S | 864 | 5.4%
n | 36 | 0.2%
a | 18 | 0.1%

modalidad
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 KiB

Value | Count | Frequency (%)
ESPECIALIZACION | 468 | 23.4%
INGENIERIA | 342 | 17.1%
TECNOLOGIA | 306 | 15.3%
MAESTRIA | 288 | 14.4%
LICENCIATURA | 234 | 11.7%
ARTES | 108 | 5.4%
COMPONENTE PROPEDÉUTICO | 108 | 5.4%
DOCTORADO | 54 | 2.7%
PROYECTO ACADEMICO | 36 | 1.8%
ADMINISTRACION | 36 | 1.8%
CICLO BASICO INGENIERIA | 18 | 0.9%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 24
Median length: 10
Mean length: 11.90990991
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 21
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
I | 3834 | 16.1%
E | 2988 | 12.6%
A | 2970 | 12.5%
C | 2178 | 9.2%
N | 2016 | 8.5%
O | 1854 | 7.8%
T | 1278 | 5.4%
R | 1224 | 5.1%
L | 1026 | 4.3%
S | 918 | 3.9%
P | 828 | 3.5%
G | 666 | 2.8%
M | 468 | 2.0%
Z | 468 | 2.0%
U | 342 | 1.4%
D | 288 | 1.2%
(space) | 180 | 0.8%
Ã | 108 | 0.5%
‰ | 108 | 0.5%
Y | 36 | 0.2%
B | 18 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 23508 | 98.8%
Space Separator | 180 | 0.8%
Control | 108 | 0.5%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
I | 3834 | 16.3%
E | 2988 | 12.7%
A | 2970 | 12.6%
C | 2178 | 9.3%
N | 2016 | 8.6%
O | 1854 | 7.9%
T | 1278 | 5.4%
R | 1224 | 5.2%
L | 1026 | 4.4%
S | 918 | 3.9%
P | 828 | 3.5%
G | 666 | 2.8%
M | 468 | 2.0%
Z | 468 | 2.0%
U | 342 | 1.5%
D | 288 | 1.2%
Ã | 108 | 0.5%
Y | 36 | 0.2%
B | 18 | 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 180 | 100.0%

Most frequent Control characters

Value | Count | Frequency (%)
‰ | 108 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 23508 | 98.8%
Common | 288 | 1.2%

Most frequent Latin characters

Value | Count | Frequency (%)
I | 3834 | 16.3%
E | 2988 | 12.7%
A | 2970 | 12.6%
C | 2178 | 9.3%
N | 2016 | 8.6%
O | 1854 | 7.9%
T | 1278 | 5.4%
R | 1224 | 5.2%
L | 1026 | 4.4%
S | 918 | 3.9%
P | 828 | 3.5%
G | 666 | 2.8%
M | 468 | 2.0%
Z | 468 | 2.0%
U | 342 | 1.5%
D | 288 | 1.2%
Ã | 108 | 0.5%
Y | 36 | 0.2%
B | 18 | 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 180 | 62.5%
‰ | 108 | 37.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 23580 | 99.1%
None | 216 | 0.9%

Most frequent ASCII characters

Value | Count | Frequency (%)
I | 3834 | 16.3%
E | 2988 | 12.7%
A | 2970 | 12.6%
C | 2178 | 9.2%
N | 2016 | 8.5%
O | 1854 | 7.9%
T | 1278 | 5.4%
R | 1224 | 5.2%
L | 1026 | 4.4%
S | 918 | 3.9%
P | 828 | 3.5%
G | 666 | 2.8%
M | 468 | 2.0%
Z | 468 | 2.0%
U | 342 | 1.5%
D | 288 | 1.2%
(space) | 180 | 0.8%
Y | 36 | 0.2%
B | 18 | 0.1%

Most frequent None characters

Value | Count | Frequency (%)
Ã | 108 | 50.0%
‰ | 108 | 50.0%
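
The two non-ASCII characters each occur 108 times, the same count as COMPONENTE PROPEDÉUTICO, and one of them falls in the Control category. That pattern is what one would expect if the UTF-8 bytes of "É" (C3 89) had been decoded as Latin-1 when the data was loaded; this is an inference from the tables, not something the report states. Under that assumption, a hypothetical helper like the one below would restore the value before profiling.

```python
def repair_mojibake(text):
    """Undo UTF-8 bytes that were decoded as Latin-1; return the input unchanged otherwise."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or not valid UTF-8 once re-encoded: leave as is.
        return text


# Assumed raw value: U+00C3 followed by the control character U+0089.
garbled = "COMPONENTE PROPED\u00c3\u0089UTICO"
print(repair_mojibake(garbled))
```

Values that are plain ASCII or already correctly decoded pass through unchanged, so the helper could be applied to the whole column.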
 

inscritos
Real number (ℝ≥0)

MISSING

Distinct: 448
Distinct (%): 59.3%
Missing: 1243
Missing (%): 62.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 325.1324503
Minimum: 1
Maximum: 3171
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 22.7
Q1: 117
median: 218
Q3: 371.5
95-th percentile: 1000.1
Maximum: 3171
Range: 3170
Interquartile range (IQR): 254.5

Descriptive statistics

Standard deviation: 369.3902474
Coefficient of variation (CV): 1.136122362
Kurtosis: 13.15474964
Mean: 325.1324503
Median Absolute Deviation (MAD): 113
Skewness: 3.124956221
Sum: 245475
Variance: 136449.1548
Monotonicity: Not monotonic
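
With 62.2% of the values missing, it helps to see which denominator each statistic uses. The arithmetic sketch below, using only numbers reported for inscritos, suggests that the mean and the distinct percentage are computed over the non-missing rows rather than over all 1998 observations; the variable names are illustrative.

```python
# Figures reported for inscritos.
N_ROWS = 1998
missing = 1243
total = 245475        # reported Sum
distinct = 448
std = 369.3902474     # reported standard deviation

present = N_ROWS - missing                      # rows that actually hold a value
mean = total / present                          # mean over non-missing rows only
distinct_pct = round(100 * distinct / present, 1)
cv = std / mean                                 # coefficient of variation

print(present, round(mean, 7), distinct_pct, round(cv, 6))
```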
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 19 | 1.0%
107 | 6 | 0.3%
212 | 6 | 0.3%
150 | 5 | 0.3%
234 | 5 | 0.3%
280 | 5 | 0.3%
129 | 5 | 0.3%
146 | 4 | 0.2%
183 | 4 | 0.2%
93 | 4 | 0.2%
72 | 4 | 0.2%
106 | 4 | 0.2%
214 | 4 | 0.2%
61 | 4 | 0.2%
122 | 4 | 0.2%
162 | 4 | 0.2%
204 | 4 | 0.2%
101 | 4 | 0.2%
239 | 4 | 0.2%
82 | 4 | 0.2%
95 | 4 | 0.2%
133 | 4 | 0.2%
117 | 4 | 0.2%
275 | 4 | 0.2%
135 | 4 | 0.2%
Other values (423) | 632 | 31.6%
(Missing) | 1243 | 62.2%

Minimum values
Value | Count | Frequency (%)
1 | 19 | 1.0%
2 | 1 | 0.1%
4 | 2 | 0.1%
9 | 2 | 0.1%
10 | 1 | 0.1%
11 | 2 | 0.1%
12 | 2 | 0.1%
13 | 2 | 0.1%
14 | 2 | 0.1%
15 | 1 | 0.1%

Maximum values
Value | Count | Frequency (%)
3171 | 1 | 0.1%
2883 | 1 | 0.1%
2216 | 1 | 0.1%
2094 | 1 | 0.1%
2089 | 1 | 0.1%
2067 | 1 | 0.1%
2026 | 1 | 0.1%
1940 | 1 | 0.1%
1863 | 1 | 0.1%
1838 | 1 | 0.1%
 

admitidos
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 189
Distinct (%): 15.6%
Missing: 783
Missing (%): 39.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.45349794
Minimum: 1
Maximum: 315
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11
Q1: 30
median: 55
Q3: 102
95-th percentile: 161.3
Maximum: 315
Range: 314
Interquartile range (IQR): 72

Descriptive statistics

Standard deviation: 48.54160669
Coefficient of variation (CV): 0.6989080195
Kurtosis: 0.005195709193
Mean: 69.45349794
Median Absolute Deviation (MAD): 35
Skewness: 0.7420157035
Sum: 84386
Variance: 2356.28758
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
37 | 30 | 1.5%
36 | 25 | 1.3%
20 | 23 | 1.2%
40 | 21 | 1.1%
16 | 20 | 1.0%
38 | 20 | 1.0%
24 | 19 | 1.0%
13 | 17 | 0.9%
10 | 17 | 0.9%
42 | 16 | 0.8%
23 | 16 | 0.8%
22 | 15 | 0.8%
21 | 15 | 0.8%
39 | 15 | 0.8%
35 | 15 | 0.8%
41 | 14 | 0.7%
14 | 14 | 0.7%
99 | 14 | 0.7%
96 | 14 | 0.7%
12 | 14 | 0.7%
15 | 13 | 0.7%
44 | 13 | 0.7%
105 | 13 | 0.7%
34 | 13 | 0.7%
100 | 13 | 0.7%
Other values (164) | 796 | 39.8%
(Missing) | 783 | 39.2%

Minimum values
Value | Count | Frequency (%)
1 | 6 | 0.3%
2 | 2 | 0.1%
3 | 5 | 0.3%
4 | 3 | 0.2%
5 | 4 | 0.2%
6 | 4 | 0.2%
7 | 3 | 0.2%
8 | 7 | 0.4%
9 | 7 | 0.4%
10 | 17 | 0.9%

Maximum values
Value | Count | Frequency (%)
315 | 1 | 0.1%
235 | 1 | 0.1%
227 | 1 | 0.1%
211 | 1 | 0.1%
210 | 1 | 0.1%
197 | 1 | 0.1%
194 | 3 | 0.2%
192 | 1 | 0.1%
190 | 3 | 0.2%
188 | 1 | 0.1%
 

primiparos
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 153
Distinct (%): 13.0%
Missing: 817
Missing (%): 40.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.72904318
Minimum: 1
Maximum: 260
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 11
Q1: 26
median: 51
Q3: 80
95-th percentile: 126
Maximum: 260
Range: 259
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 37.25042707
Coefficient of variation (CV): 0.6452631988
Kurtosis: 0.3530010113
Mean: 57.72904318
Median Absolute Deviation (MAD): 27
Skewness: 0.7444055944
Sum: 68178
Variance: 1387.594317
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
37 | 28 | 1.4%
78 | 25 | 1.3%
36 | 24 | 1.2%
35 | 24 | 1.2%
38 | 22 | 1.1%
16 | 22 | 1.1%
25 | 22 | 1.1%
20 | 21 | 1.1%
15 | 20 | 1.0%
42 | 20 | 1.0%
80 | 20 | 1.0%
19 | 20 | 1.0%
83 | 19 | 1.0%
39 | 19 | 1.0%
81 | 19 | 1.0%
76 | 19 | 1.0%
40 | 18 | 0.9%
22 | 18 | 0.9%
73 | 17 | 0.9%
79 | 17 | 0.9%
82 | 17 | 0.9%
17 | 16 | 0.8%
60 | 16 | 0.8%
43 | 16 | 0.8%
1 | 15 | 0.8%
Other values (128) | 687 | 34.4%
(Missing) | 817 | 40.9%

Minimum values
Value | Count | Frequency (%)
1 | 15 | 0.8%
2 | 6 | 0.3%
3 | 6 | 0.3%
4 | 3 | 0.2%
5 | 4 | 0.2%
6 | 3 | 0.2%
7 | 1 | 0.1%
8 | 4 | 0.2%
9 | 6 | 0.3%
10 | 6 | 0.3%

Maximum values
Value | Count | Frequency (%)
260 | 1 | 0.1%
201 | 1 | 0.1%
178 | 1 | 0.1%
170 | 1 | 0.1%
169 | 1 | 0.1%
165 | 1 | 0.1%
164 | 1 | 0.1%
163 | 1 | 0.1%
160 | 1 | 0.1%
159 | 1 | 0.1%
 

matriculados
Real number (ℝ≥0)

MISSING

Distinct: 628
Distinct (%): 44.9%
Missing: 598
Missing (%): 29.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 357.9007143
Minimum: 1
Maximum: 1572
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9.95
Q1: 46
median: 244.5
Q3: 647
95-th percentile: 996.3
Maximum: 1572
Range: 1571
Interquartile range (IQR): 601

Descriptive statistics

Standard deviation: 360.1806246
Coefficient of variation (CV): 1.006370231
Kurtosis: 0.1284024813
Mean: 357.9007143
Median Absolute Deviation (MAD): 217.5
Skewness: 0.9397729588
Sum: 501061
Variance: 129730.0823
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
3 | 14 | 0.7%
22 | 13 | 0.7%
31 | 12 | 0.6%
50 | 12 | 0.6%
24 | 12 | 0.6%
12 | 11 | 0.6%
48 | 11 | 0.6%
2 | 11 | 0.6%
23 | 11 | 0.6%
52 | 10 | 0.5%
38 | 10 | 0.5%
27 | 10 | 0.5%
1 | 9 | 0.5%
6 | 9 | 0.5%
57 | 9 | 0.5%
25 | 9 | 0.5%
17 | 9 | 0.5%
39 | 9 | 0.5%
10 | 9 | 0.5%
33 | 9 | 0.5%
35 | 8 | 0.4%
37 | 8 | 0.4%
45 | 8 | 0.4%
15 | 8 | 0.4%
29 | 8 | 0.4%
Other values (603) | 1151 | 57.6%
(Missing) | 598 | 29.9%

Minimum values
Value | Count | Frequency (%)
1 | 9 | 0.5%
2 | 11 | 0.6%
3 | 14 | 0.7%
4 | 7 | 0.4%
5 | 8 | 0.4%
6 | 9 | 0.5%
8 | 7 | 0.4%
9 | 5 | 0.3%
10 | 9 | 0.5%
11 | 4 | 0.2%

Maximum values
Value | Count | Frequency (%)
1572 | 2 | 0.1%
1520 | 1 | 0.1%
1515 | 1 | 0.1%
1488 | 1 | 0.1%
1461 | 1 | 0.1%
1459 | 1 | 0.1%
1451 | 1 | 0.1%
1440 | 1 | 0.1%
1424 | 1 | 0.1%
1422 | 1 | 0.1%
 

egresados
Real number (ℝ≥0)

MISSING

Distinct: 333
Distinct (%): 30.0%
Missing: 888
Missing (%): 44.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.1927928
Minimum: 1
Maximum: 796
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 15
median: 72
Q3: 174.75
95-th percentile: 367.65
Maximum: 796
Range: 795
Interquartile range (IQR): 159.75

Descriptive statistics

Standard deviation: 122.9938995
Coefficient of variation (CV): 1.086587728
Kurtosis: 3.326158519
Mean: 113.1927928
Median Absolute Deviation (MAD): 65
Skewness: 1.638593901
Sum: 125644
Variance: 15127.49932
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 55 | 2.8%
2 | 40 | 2.0%
4 | 26 | 1.3%
3 | 24 | 1.2%
6 | 19 | 1.0%
8 | 18 | 0.9%
15 | 15 | 0.8%
5 | 15 | 0.8%
7 | 14 | 0.7%
14 | 12 | 0.6%
12 | 11 | 0.6%
10 | 11 | 0.6%
9 | 10 | 0.5%
24 | 10 | 0.5%
42 | 10 | 0.5%
13 | 10 | 0.5%
21 | 9 | 0.5%
44 | 9 | 0.5%
18 | 9 | 0.5%
19 | 9 | 0.5%
68 | 8 | 0.4%
39 | 8 | 0.4%
27 | 8 | 0.4%
43 | 8 | 0.4%
97 | 8 | 0.4%
Other values (308) | 734 | 36.7%
(Missing) | 888 | 44.4%

Minimum values
Value | Count | Frequency (%)
1 | 55 | 2.8%
2 | 40 | 2.0%
3 | 24 | 1.2%
4 | 26 | 1.3%
5 | 15 | 0.8%
6 | 19 | 1.0%
7 | 14 | 0.7%
8 | 18 | 0.9%
9 | 10 | 0.5%
10 | 11 | 0.6%

Maximum values
Value | Count | Frequency (%)
796 | 1 | 0.1%
755 | 1 | 0.1%
720 | 1 | 0.1%
621 | 1 | 0.1%
598 | 1 | 0.1%
589 | 1 | 0.1%
584 | 1 | 0.1%
568 | 1 | 0.1%
548 | 1 | 0.1%
540 | 2 | 0.1%
 

graduados
Real number (ℝ≥0)

MISSING

Distinct: 105
Distinct (%): 9.0%
Missing: 831
Missing (%): 41.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.50642674
Minimum: 1
Maximum: 210
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 10
median: 26
Q3: 42
95-th percentile: 73
Maximum: 210
Range: 209
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 24.2470083
Coefficient of variation (CV): 0.8217534613
Kurtosis: 5.47525606
Mean: 29.50642674
Median Absolute Deviation (MAD): 16
Skewness: 1.622354206
Sum: 34434
Variance: 587.9174115
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 57 | 2.9%
3 | 38 | 1.9%
7 | 35 | 1.8%
2 | 31 | 1.6%
8 | 27 | 1.4%
19 | 26 | 1.3%
11 | 26 | 1.3%
6 | 26 | 1.3%
30 | 25 | 1.3%
33 | 25 | 1.3%
10 | 24 | 1.2%
5 | 23 | 1.2%
36 | 23 | 1.2%
24 | 22 | 1.1%
9 | 21 | 1.1%
16 | 21 | 1.1%
34 | 21 | 1.1%
22 | 20 | 1.0%
23 | 20 | 1.0%
29 | 20 | 1.0%
27 | 19 | 1.0%
35 | 19 | 1.0%
15 | 19 | 1.0%
37 | 19 | 1.0%
13 | 19 | 1.0%
Other values (80) | 541 | 27.1%
(Missing) | 831 | 41.6%

Minimum values
Value | Count | Frequency (%)
1 | 57 | 2.9%
2 | 31 | 1.6%
3 | 38 | 1.9%
4 | 14 | 0.7%
5 | 23 | 1.2%
6 | 26 | 1.3%
7 | 35 | 1.8%
8 | 27 | 1.4%
9 | 21 | 1.1%
10 | 24 | 1.2%

Maximum values
Value | Count | Frequency (%)
210 | 1 | 0.1%
183 | 1 | 0.1%
153 | 1 | 0.1%
151 | 1 | 0.1%
143 | 1 | 0.1%
137 | 1 | 0.1%
124 | 1 | 0.1%
118 | 1 | 0.1%
114 | 1 | 0.1%
107 | 1 | 0.1%
 

retirados
Real number (ℝ≥0)

MISSING

Distinct: 88
Distinct (%): 9.1%
Missing: 1030
Missing (%): 51.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.82541322
Minimum: 1
Maximum: 172
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 11
Q3: 24
95-th percentile: 57.65
Maximum: 172
Range: 171
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 19.9961603
Coefficient of variation (CV): 1.121778219
Kurtosis: 9.201080307
Mean: 17.82541322
Median Absolute Deviation (MAD): 8
Skewness: 2.461404976
Sum: 17255
Variance: 399.8464269
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 83 | 4.2%
2 | 80 | 4.0%
6 | 50 | 2.5%
4 | 45 | 2.3%
5 | 43 | 2.2%
3 | 40 | 2.0%
7 | 34 | 1.7%
10 | 34 | 1.7%
8 | 29 | 1.5%
9 | 27 | 1.4%
14 | 26 | 1.3%
16 | 26 | 1.3%
15 | 26 | 1.3%
13 | 26 | 1.3%
11 | 25 | 1.3%
12 | 24 | 1.2%
20 | 22 | 1.1%
18 | 16 | 0.8%
19 | 15 | 0.8%
28 | 15 | 0.8%
17 | 14 | 0.7%
24 | 14 | 0.7%
22 | 13 | 0.7%
29 | 13 | 0.7%
34 | 12 | 0.6%
Other values (63) | 216 | 10.8%
(Missing) | 1030 | 51.6%

Minimum values
Value | Count | Frequency (%)
1 | 83 | 4.2%
2 | 80 | 4.0%
3 | 40 | 2.0%
4 | 45 | 2.3%
5 | 43 | 2.2%
6 | 50 | 2.5%
7 | 34 | 1.7%
8 | 29 | 1.5%
9 | 27 | 1.4%
10 | 34 | 1.7%

Maximum values
Value | Count | Frequency (%)
172 | 1 | 0.1%
144 | 1 | 0.1%
138 | 1 | 0.1%
128 | 1 | 0.1%
108 | 1 | 0.1%
102 | 1 | 0.1%
100 | 2 | 0.1%
95 | 1 | 0.1%
93 | 1 | 0.1%
92 | 2 | 0.1%
 

Interactions

[64 interaction plots, rendered as SVG images with Matplotlib v3.3.3; images not included]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
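
As a rough illustration of that calculation, the following sketch computes r in plain Python. The function name `pearson_r` is hypothetical and not part of pandas-profiling; the two lists are the `admitidos` and `primiparos` values from the "Last rows" sample, restricted to rows where both are present. It is only a sketch of the formula, not the report's actual implementation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel each other, so plain sums of products are sufficient here.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Values copied from the "Last rows" sample (rows without NaN in either column).
admitidos = [146.0, 146.0, 156.0, 138.0, 78.0, 145.0]
primiparos = [119.0, 117.0, 112.0, 119.0, 59.0, 118.0]

r = pearson_r(admitidos, primiparos)
# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(admitidos, [2.0 * p + 5.0 for p in primiparos])
print(r, r_rescaled)
```

The second call illustrates the invariance under location and scale mentioned above: both printed values should agree up to floating-point rounding.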

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
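
A minimal sketch of this definition, assuming ties receive the average of the ranks they span (a common convention, not confirmed by the report). The helper names `average_ranks` and `spearman_rho` are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# A nonlinear but strictly increasing relationship is perfectly monotonic.
xs = [1, 2, 3, 4]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
```

Because only the ordering of the values enters the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.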

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
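
The pair-counting definition above can be sketched as follows. This is the plain "tau-a" form exactly as described; library implementations often apply a tie correction (tau-b), so their results may differ when ties are present. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    pairs = list(combinations(zip(xs, ys), 2))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)


# Three observations give three pairs: two concordant, one discordant.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The quadratic loop over all pairs is fine for an illustration, but it would be slow on long columns.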

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation on Phik is available separately.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
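
For illustration, a plain-Python sketch of a bias-corrected Cramér's V in the form usually attributed to Bergsma (2013) follows. The function name `cramers_v_corrected` is hypothetical, and the code is an assumption about how such a correction can be written, not the report's own implementation. The input lists are the `nivel` and `modalidad` values from the "First rows" sample.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Corrected quantities: subtract the expected bias and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Values copied from the ten "First rows" of the sample.
nivel = ["PREGRADO"] * 4 + ["POSGRADO"] * 6
modalidad = ["ARTES"] * 5 + ["MAESTRIA", "DOCTORADO", "DOCTORADO",
                             "ESPECIALIZACION", "ESPECIALIZACION"]
print(cramers_v_corrected(nivel, modalidad))
```

Ten rows are far too few for a meaningful estimate; the call only shows the expected input shape.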

Missing values


Sample

First rows

| | anno | semestre | anno_semestre | facultad | proyecto | nivel | modalidad | inscritos | admitidos | primiparos | matriculados | egresados | graduados | retirados |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE ARTES-ASAB | ARTE DANZARIO | PREGRADO | ARTES | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE ARTES-ASAB | ARTES ESCENICAS | PREGRADO | ARTES | NaN | NaN | NaN | 308.0 | 16.0 | 10.0 | 10.0 |
| 2 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE ARTES-ASAB | ARTES MUSICALES | PREGRADO | ARTES | 280.0 | 39.0 | 37.0 | 404.0 | 63.0 | 7.0 | 14.0 |
| 3 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE ARTES-ASAB | ARTES PLASTICAS Y VISUALES | PREGRADO | ARTES | 425.0 | 39.0 | 37.0 | 417.0 | 47.0 | 23.0 | 21.0 |
| 4 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE ARTES-ASAB | ESPECIALIZACION EN VOZ ESCENICA | POSGRADO | ARTES | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 5 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE ARTES-ASAB | MAESTRIA EN ESTUDIOS ARTISTICOS | POSGRADO | MAESTRIA | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 6 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE CIENCIAS Y EDUCACION | DOCTORADO EN ESTUDIOS SOCIALES | POSGRADO | DOCTORADO | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 7 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE CIENCIAS Y EDUCACION | DOCTORADO INTERINSTITUCIONAL EN EDUCACION | POSGRADO | DOCTORADO | NaN | NaN | NaN | 33.0 | NaN | NaN | NaN |
| 8 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE CIENCIAS Y EDUCACION | ESPECIALIZACION EN CIENCIAS DE LA EDUCACION CON ENFASIS EN PSICOLINGUISTICA | POSGRADO | ESPECIALIZACION | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9 | 2009 | 1 | 01/01/2009 12:00:00 AM | FACULTAD DE CIENCIAS Y EDUCACION | ESPECIALIZACION EN DESARROLLO HUMANO CON ENFASIS EN PROCESOS AFECTIVOS Y CREATIVIDAD | POSGRADO | ESPECIALIZACION | NaN | 22.0 | 20.0 | 55.0 | NaN | 2.0 | NaN |

Last rows

| | anno | semestre | anno_semestre | facultad | proyecto | nivel | modalidad | inscritos | admitidos | primiparos | matriculados | egresados | graduados | retirados |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1988 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | MAESTRIA EN INGENIERIA CIVIL | POSGRADO | MAESTRIA | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1989 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | TECNOLOGIA EN CONSTRUCCIONES CIVILES | PREGRADO | TECNOLOGIA | 599.0 | 146.0 | 119.0 | 672.0 | 80.0 | 42.0 | 2.0 |
| 1990 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | TECNOLOGIA EN ELECTRICIDAD | PREGRADO | TECNOLOGIA | NaN | NaN | NaN | 181.0 | 29.0 | 34.0 | NaN |
| 1991 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | TECNOLOGIA EN ELECTRONICA | PREGRADO | TECNOLOGIA | 191.0 | 146.0 | 117.0 | 754.0 | 14.0 | 36.0 | NaN |
| 1992 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | TECNOLOGIA EN GESTION DE LA PRODUCCION INDUSTRIAL | PREGRADO | TECNOLOGIA | 285.0 | 156.0 | 112.0 | 897.0 | 6.0 | 55.0 | NaN |
| 1993 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | TECNOLOGIA EN MECANICA INDUSTRIAL | PREGRADO | TECNOLOGIA | 219.0 | 138.0 | 119.0 | 839.0 | 148.0 | 71.0 | NaN |
| 1994 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | TECNOLOGIA EN SISTEMAS ELECTRICOS DE MEDIA Y BAJA TENSION | PREGRADO | TECNOLOGIA | 92.0 | 78.0 | 59.0 | 228.0 | NaN | NaN | NaN |
| 1995 | 2017 | 3 | 03/01/2017 12:00:00 AM | FACULTAD DE TECNOLOGIA - POLITECNICA / TECNOLOGICA | TECNOLOGIA EN SISTEMATIZACION DE DATOS | PREGRADO | TECNOLOGIA | 171.0 | 145.0 | 118.0 | 774.0 | 14.0 | 34.0 | NaN |
| 1996 | 2017 | 3 | 03/01/2017 12:00:00 AM | VICERRECTORIA ACADEMICA | NO_REGISTRA | POSGRADO | PROYECTO ACADEMICO | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1997 | 2017 | 3 | 03/01/2017 12:00:00 AM | VICERRECTORIA CONVENIOS | ARTES ESCENICAS | PREGRADO | ARTES | NaN | NaN | NaN | NaN | NaN | NaN | NaN |